The dataset consists of scientific articles from three journals:
| Journal | # Articles before preprocessing | # Articles after preprocessing |
|---|---|---|
| EIST | 683 | 574 |
| RSOG | 659 | 639 |
| Sus-Sci | 1191 | 1121 |
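The counts in the table imply how much of each journal's corpus survived preprocessing. A minimal sketch that computes the retained share from those numbers (the frame and column names are illustrative, not taken from the original notebook):

```python
import pandas as pd

# Article counts copied from the table above.
counts = pd.DataFrame(
    {
        "before": [683, 659, 1191],
        "after": [574, 639, 1121],
    },
    index=["EIST", "RSOG", "Sus-Sci"],
)

# Number of articles dropped and share of articles kept, per journal.
counts["dropped"] = counts["before"] - counts["after"]
counts["retained_pct"] = (100 * counts["after"] / counts["before"]).round(1)

print(counts)
```

EIST loses the largest share (about 16% of its articles), while RSOG keeps about 97%.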
import json
import pandas as pd
data_path = 'data/extract_EIST.json'
with open(data_path, 'r') as fd:
    data = json.load(fd)
df = pd.DataFrame(data).T
df.head()
| | file_name | doi | title | abstract | text | location | year | authors |
|---|---|---|---|---|---|---|---|---|
| 1 | -It-s-not-talked-about---The-risk-of-failure-_... | 10.1016/j.eist.2020.02.008 | “It's not talked about”: The risk of failure i... | Scholars of sustainability transition have giv... | {'Introduction': ' A transition away from the ... | UK | 2020 | [Beck Collins] |
| 2 | -Making-energy-transition-work---Bricolage-_20... | 10.1016/j.eist.2020.07.005 | “Making energy transition work”: Bricolage in ... | In the quest for energy transition pathways, e... | {'Introduction': ' Local energy transitions ha... | Austria | 2020 | [Johannes Suitner, Martha Ecker, T U Wien] |
| 3 | 1-s2.0-S2210422419302618-main | 10.1016/j.eist.2019.10.005 | Thinking about individual actor-level perspect... | The 2019 STRN research agenda identifies conne... | {'Introduction: background and rationale': ' T... | Germany | 2020 | [Paul Upham, Paula Bögel, Elisabeth Dütschke] |
| 4 | 1-s2.0-S2210422419302850-main | 10.1016/j.eist.2019.11.008 | Not more but different: A comment on the trans... | The sustainability transitions research networ... | {'Introduction': ' The comprehensive agenda fo... | UK | 2020 | [Debbie Hopkins, Johannes Kester, Toon Meelen,... |
| 5 | 1-s2.0-S2210422420300277-main | 10.1016/j.eist.2020.02.001 | Let's focus more on negative trends: A comment... | Much has been written on sustainability transi... | {'Introduction': ' The analysis of sustainabil... | UK | 2020 | [Miklós Antal, Giulio Mattioli, Imogen Rattle,... |
# JSON object keys are loaded as strings, so the row label is '1', not the integer 1.
df.loc['1', 'abstract']
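Label-based lookups on this frame depend on the index type: JSON object keys are always strings, so a frame built from `json.load` output is indexed by `'1'`, `'2'`, and so on, even though the display shows the labels without quotes. An integer lookup such as `df.loc[1, ...]` therefore raises `KeyError: 1`. A minimal sketch with made-up records (the toy `raw` string is an assumption standing in for `data/extract_EIST.json`):

```python
import json
import pandas as pd

# Toy stand-in for the extracted file: outer keys are article ids (assumed layout).
raw = '{"1": {"title": "A", "year": 2020}, "2": {"title": "B", "year": 2019}}'
data = json.loads(raw)

# Same construction as above: one column per article, then transpose.
df_demo = pd.DataFrame(data).T

print(df_demo.index.tolist())      # labels are strings
print(df_demo.loc["1", "title"])   # a string label works

# Optional: convert the labels to integers so that df_int.loc[1, ...] works.
df_int = df_demo.copy()
df_int.index = df_int.index.astype(int)
print(df_int.loc[1, "title"])
```

Either approach works; converting the index once is convenient when many integer lookups follow.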
data_path = 'data/prepro_EIST.json'
with open(data_path, 'r') as fd:
    data = json.load(fd)
df = pd.DataFrame(data)
df.head()
df['text'][0]
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
img = mpimg.imread('disp/3journals_optim.png')
plt.imshow(img)
plt.axis('off')
from IPython.display import IFrame
IFrame(src='disp/topic_network_3_journals_antons.html', width=900, height=700)
df = pd.read_csv('disp/centrality_full_network.csv')
df.sort_values(by=['Degree Centrality'], ascending=False).head(7)
| | Unnamed: 0 | Topic | Degree Centrality | Degree per Article | Betweenness Centrality | Betweenness per Article | Clustering |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0_interview_data_conducted_participant | 59.430657 | 0.716032 | 2213.512033 | 26.668820 | 0.130918 |
| 5 | 5 | 5_transformation_actor_change_transition | 38.277372 | 0.797445 | 806.733079 | 16.806939 | 0.180654 |
| 9 | 9 | 9_emission_scenario_reduction_carbon | 36.262774 | 0.614623 | 936.007673 | 15.864537 | 0.163492 |
| 2 | 2 | 2_complexity_system_approach_process | 29.211679 | 0.561763 | 470.744724 | 9.052783 | 0.182266 |
| 137 | 137 | 137_game_approach_process_change | 25.182482 | 1.144658 | 213.092696 | 9.686032 | 0.303333 |
| 4 | 4 | 4_area_land_water_scenario | 23.167883 | 0.413712 | 288.320436 | 5.148579 | 0.245059 |
| 15 | 15 | 15_forest_land_scenario_deforestation | 23.167883 | 0.772263 | 252.689720 | 8.422991 | 0.245059 |
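Judging from the displayed values, the two "per Article" columns look like the raw centrality divided by the number of articles assigned to the topic (for topic 0, 59.430657 / 83 ≈ 0.716032, with 83 taken from the `count` column of the temporal table further down). The source does not state this definition, so the sketch below is only a consistency check under that assumption, with values copied by hand from the displayed tables:

```python
import pandas as pd

# Values copied from the tables shown in this notebook (topics 0, 2 and 4).
check = pd.DataFrame(
    {
        "degree": [59.430657, 29.211679, 23.167883],
        "betweenness": [2213.512033, 470.744724, 288.320436],
        "count": [83, 52, 56],  # articles per topic, from the temporal table
    },
    index=["topic_0", "topic_2", "topic_4"],
)

# Assumed definition: centrality divided by the topic's article count.
check["degree_per_article"] = (check["degree"] / check["count"]).round(6)
check["betweenness_per_article"] = (check["betweenness"] / check["count"]).round(6)

print(check[["degree_per_article", "betweenness_per_article"]])
```

For these three topics the results match the table, which explains why a small topic such as `137_game_approach_process_change` can rank highest per article despite a lower raw degree.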
import plotly.express as px
df = pd.read_csv('disp/edge_weight_dist_full_network.csv')
fig = px.pie(df, values='%', names='Edge Weight')
fig.show()
df = pd.read_csv('disp/3_journals_topics.csv')
df = df.rename(columns={'Volumne': 'Volume'})
df.head()
| | Unnamed: 0 | Topic Label | topic_nr | most_freq_words | rep_doc_year | title | Volume | Authors |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0_interview_data_conducted_participant | 0 | ['interview', 'data', 'conducted', 'participan... | 2020 | Sharing among neighbours in a Norwegian suburb | 37 | Westskog H., Aase T.H., Standal K., Tellefsen S. |
| 1 | 2 | 1_university_student_school_stanford | 1 | ['university', 'student', 'school', 'stanford'... | 2010 | Chapter 23: The Stanford organizational studie... | 28 | Meyerson D.E. |
| 2 | 3 | 2_complexity_system_approach_process | 2 | ['complexity', 'system', 'approach', 'process'... | 2020 | SHIFT IN HYBRIDITY IN RESPONSE TO ENVIRONMENTA... | 69 | Ramus T., Vaccaro A., Versari P., Brusoni S. |
| 3 | 4 | 3_sustainability_research_student_science | 3 | ['sustainability', 'research', 'student', 'sci... | 2021 | The patterns of curriculum change processes th... | 16.0 | Weiss M., Barth M., von Wehrden H. |
| 4 | 5 | 4_area_land_water_scenario | 4 | ['area', 'land', 'water', 'scenario', 'forest'... | 2019 | The seasonal and scale-dependent associations ... | 14.0 | Aiba M., Shibata R., Oguro M., Nakashizuka T. |
plt.figure(figsize = (35,30))
img = mpimg.imread('disp/3_journals.png')
plt.imshow(img, aspect='auto')
plt.axis('off')
from IPython.display import IFrame
IFrame(src='disp/hierarchical_topics_eist.html', width=1000, height=1000)
from IPython.display import IFrame
IFrame(src='disp/hierarchical_topics_rsog.html', width=1000, height=1000)
from IPython.display import IFrame
IFrame(src='disp/hierarchical_topics_sus_sci.html', width=1000, height=1200)
from IPython.display import IFrame
IFrame(src='disp/hierarchical_topics_3_journals.html', width=1000, height=1400)
df = pd.read_csv('disp/descriptive_stats_3_journals.csv')
df.head()
| | Unnamed: 0 | Topic Label | standardized_mean | max | min |
|---|---|---|---|---|---|
| 0 | 0 | 0_interview_data_conducted_participant | 3.637 | 0.560 | 0.0 |
| 1 | 1 | 1_university_student_school_stanford | 1.789 | 0.951 | 0.0 |
| 2 | 2 | 2_complexity_system_approach_process | 1.542 | 0.630 | 0.0 |
| 3 | 3 | 3_sustainability_research_student_science | 1.447 | 0.863 | 0.0 |
| 4 | 4 | 4_area_land_water_scenario | 3.686 | 0.854 | 0.0 |
df = pd.read_csv('disp/3_journals_temp_dev_trajc.csv')
df.head()
| | Unnamed: 0 | topic_label | count | year_mean | year_std | year_min | year_max | coeff_linear | coeff_quadratic | coeff_linear_of_quadratic |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0_interview_data_conducted_participant | 83 | 2015.153846 | 5.446893 | 2001 | 2022 | 0.7577 | 0.0673 | 0.7577 |
| 1 | 1 | 1_university_student_school_stanford | 17 | 2015.500000 | 3.905125 | 2010 | 2021 | -0.8770 | 0.1167 | -0.8770 |
| 2 | 2 | 2_complexity_system_approach_process | 52 | 2016.083333 | 4.050892 | 2009 | 2022 | 0.4452 | 0.0382 | 0.4452 |
| 3 | 3 | 3_sustainability_research_student_science | 36 | 2015.692308 | 4.120952 | 2009 | 2022 | 0.2857 | 0.0407 | 0.2857 |
| 4 | 4 | 4_area_land_water_scenario | 56 | 2015.076923 | 4.730613 | 2007 | 2022 | 0.4114 | 0.0731 | 0.4114 |
import numpy as np
df['year_mean'] = np.around(df['year_mean'], 3)
df['year_std'] = np.around(df['year_std'], 3)
df.drop(['coeff_linear_of_quadratic'], axis=1).to_csv('3_journals_temp_dev_trajc.csv')
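The `coeff_linear` and `coeff_quadratic` columns suggest that a linear and a quadratic trend were fitted to each topic's yearly publication counts, though the source does not show the fitting code. A minimal sketch of one way to obtain such coefficients with `numpy.polyfit`, using made-up yearly counts (the data, the centering of the years, and the helper name `trend_coefficients` are all assumptions):

```python
import numpy as np

def trend_coefficients(years, counts):
    """Fit linear and quadratic trends to yearly counts (hypothetical helper)."""
    # Center the years so the fit is not dominated by the offset around 2000.
    x = np.asarray(years, dtype=float)
    x = x - x.mean()
    y = np.asarray(counts, dtype=float)

    # np.polyfit returns coefficients from highest to lowest degree.
    slope, _ = np.polyfit(x, y, deg=1)
    quad, lin_of_quad, _ = np.polyfit(x, y, deg=2)
    return {
        "coeff_linear": slope,
        "coeff_quadratic": quad,
        "coeff_linear_of_quadratic": lin_of_quad,
    }

# Made-up example: articles per year for one topic.
years = np.arange(2010, 2023)
counts = [1, 1, 2, 2, 3, 3, 5, 6, 6, 8, 9, 11, 13]
coeffs = trend_coefficients(years, counts)
print(coeffs)
```

With evenly spaced, centered years, the linear term of the quadratic fit equals the slope of the linear fit. That would be consistent with `coeff_linear_of_quadratic` duplicating `coeff_linear` in the rows shown and being dropped before export.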
from IPython.display import IFrame
IFrame(src='disp/3_journals_hot.html', width=1000, height=600)
from IPython.display import IFrame
IFrame(src='disp/3_journals_cold.html', width=1000, height=600)
from IPython.display import IFrame
IFrame(src='disp/3_journals_reviving.html', width=1000, height=600)
from IPython.display import IFrame
IFrame(src='disp/3_journals_evergreen.html', width=1000, height=600)
from IPython.display import IFrame
IFrame(src='disp/3_journals_wallflowers.html', width=1000, height=600)
from IPython.display import IFrame
IFrame(src='disp/continous_color.html', width=900, height=600)
from IPython.display import IFrame
IFrame(src='disp/colab_network.html', width=1000, height=600)
from IPython.display import IFrame
IFrame(src='disp/within_form_colab_network.html', width=1000, height=600)
from IPython.display import IFrame
IFrame(src='disp/outside_form_colab_network.html', width=1000, height=600)
from IPython.display import IFrame
IFrame(src='disp/new_discourse.html', width=1000, height=600)
from IPython.display import IFrame
IFrame(src='disp/network_with_new_cluster.html', width=1000, height=600)
#(0.33+-0.02, 0.33+-0.02, 0.33+-0.02)
from IPython.display import IFrame
IFrame(src='disp/network_with_interstitial_cluster.html', width=1000, height=600)
"""
Associational :
'social responsibility ,common good ,civil society ,core values ,humanitarianism ,altruism ,empowerment ,
political action ,positive action ,self interest ,moral character ,betterment ,social harmony ,advocates ,
acceptance ,accountability ,ethical behavior ,human decency ,ideals ,society ,compassion ,social stability ,
principles ,aims ,self interest ,conformity ,moral integrity ,individualism ,human dignity ,notion ,free thought ,
personal integrity ,ideology ,continued existence ,nonviolence ,individual freedom ,autonomy ,benevolence ,
noble goal ,non violence ,perpetuation ,advocacy ,religious faith ,social order ,furthering ,moral values ,
personal responsibility ,basic human ,personal agency ,very existence ,better society ,personal freedom ,discourse ,
governance ,social structure ,social change ,rationality ,social cohesion ,notions ,well being ,selfishness ,
ostensibly ,group identity ,institution ,impetus ,consequence ,morals ,human rights ,individuality ,
collective action ,personal benefit ,anathema ,activism ,strong belief ,value system ,inaction ,own sake ,
justification ,morality ,democratic values ,moral standing ,greater good ,pragmatism ,reaffirmation ,respect ,
immorality ,social equality ,social advancement ,inclusiveness ,subservience ,social reform ,undermining ,
individual autonomy ,power structure ,moral duty ,personal sense ,social movement ,undermine ,human nature ,
personal power ,accountability ,advancement ,advocacy ,awareness ,care ,charity ,commitment ,common good ,
compassion ,democracy ,development ,elimination ,emotion ,empowerment ,eradication ,ethics ,justice ,
make a difference ,mission ,moral ,motivation ,participation ,principles ,quality of life ,relationship ,
rights based ,social benefit ,social change ,social movement ,social progress ,solidarity ,trust ,values ,
vision ,voice'
"""
"""
Scientific :
'methodology ,statistical analysis ,analysis ,statistical data ,inference ,empirical data ,experimental data ,
extrapolation ,underlying assumptions ,predictive power ,abstract ,hypothesis ,correlations ,experimental results ,
analyses ,real data ,inferences ,empirical evidence ,observations ,quantification ,quantifiable data ,
statistical significance ,actual data ,methodologies ,empirically ,hypotheses ,available information ,validity ,
prior research ,empirical ,statistical methods ,statistical ,empirical research ,findings ,observation ,
quantitative data ,experimental design ,statistics ,evaluation ,data set ,basic assumptions ,scientific data ,
scientific study ,scientific analysis ,actual results ,heuristic ,supposition ,entire study ,first principles ,
criterion ,scientific process ,such studies ,objective data ,mathematical model ,observational data ,metrics ,
results ,regression analysis ,conclude ,logical reasoning ,objective analysis ,causal relationships ,study design ,
hard data ,quantifying ,statistical models ,qualitative data ,research findings ,demonstrate ,sufficient data ,
data sets ,data point ,concretely ,hypothesis testing ,theoretical model ,previous research ,relevant data ,
confidence intervals ,methods ,particular study ,p values ,extrapolations ,existing research ,Bayesian ,real world ,
study ,scientific results ,available evidence ,data points ,statistical model ,implications ,counterfactuals ,
empirical results ,quantified ,real world ,basis ,extrapolating ,assertion ,analysis ,assessment ,causality ,
control group ,correlation ,counterfactual ,criteria ,data ,design ,eligible population ,evaluation ,evidence ,
experiment ,framework ,identification strategy ,indicators ,informed philanthropy ,logical framework model ,
means of verification ,measurement ,measures ,meta analysis ,methodology ,proven strategy ,quantification ,
randomized control trials ,review ,root cause ,social impact analysis ,statistically significant ,survey ,tactics ,
target group ,theory of change ,treatment effects'
"""
"""
Managerial :
'efficiency ,profitability ,metrics ,scalability ,efficiencies ,productivity ,optimization ,trade offs ,performance ,
resource allocation ,optimizing ,available resources ,cost/benefit analysis ,optimize ,business case ,
considerations ,HunterSmith ,cost benefit ,overall system ,cost benefit ,terms ,current approach ,inefficiencies ,
current level ,robustness ,overall benefit ,optimise ,implementation ,capabilities ,efficiency gains ,future growth ,
improved performance ,incrementally ,cost savings ,significant impact ,energy efficiency ,cost-benefit analysis ,
tangible benefits ,leveraging ,balancing act ,minimal impact ,overall performance ,important factors ,
expected performance ,significant cost ,other considerations ,resource use ,long-term stability ,prioritisation ,
constraints ,improvements ,bottom line ,optimising ,increased efficiency ,allocation ,further development ,
decision process ,marginal benefit ,trade off ,business process ,feasibility ,resource consumption ,sustainability ,
cost/benefit ratio ,crucially ,incentives ,prioritization ,processes ,long-term viability ,system performance ,
current environment ,key factor ,usability ,power use ,risk analysis ,actual goal ,little benefit ,system ,
marginal benefits ,performance metrics ,cost/benefit ,existing systems ,acceptable level ,factor ,potential impact ,
actual results ,long-term growth ,cost effectiveness ,competence ,capability ,intended goal ,critical part ,
complexity ,ROI ,effort ,immediate benefit ,administrative overhead ,benchmarks ,best practice ,bottom line ,
capacity ,certification ,constituent satisfaction ,cost benefit ,earned income ,effectiveness ,efficiency ,
exit strategy ,growth ,impact ,key performance indicators ,lessons learned ,leverage ,management ,market based ,
milestones ,monitoring and evaluation ,objectives ,optimization ,outcome ,output ,performance ,productivity ,
return on investment ,smart giving ,stakeholder satisfaction ,strategic ,SWOT ,transparency ,value proposition ,
venture philanthropy'
"""
### List of Users: InnoEnergyEU, EUeic, EITUrbanMob, EITRawMaterials, EITManufactur, EITHealth, EITFood, EITeu,
### EIT_Digital, ClimateKIC
from IPython.display import IFrame
IFrame(src='disp/InnoEnergyEU.html', width=700, height=600)
from IPython.display import IFrame
IFrame(src='disp/EUeic.html', width=700, height=600)
from IPython.display import IFrame
IFrame(src='disp/EITUrbanMob.html', width=700, height=600)
from IPython.display import IFrame
IFrame(src='disp/EITRawMaterials.html', width=700, height=600)
from IPython.display import IFrame
IFrame(src='disp/EITManufactur.html', width=700, height=600)
from IPython.display import IFrame
IFrame(src='disp/EITHealth.html', width=700, height=600)
from IPython.display import IFrame
IFrame(src='disp/EITFood.html', width=700, height=600)
from IPython.display import IFrame
IFrame(src='disp/EITeu.html', width=700, height=600)
from IPython.display import IFrame
IFrame(src='disp/EIT_Digital.html', width=700, height=600)
from IPython.display import IFrame
IFrame(src='disp/ClimateKIC.html', width=700, height=600)
state_vocab = """Sovereignty, Constitution, National Security, Foreign Relations, Diplomacy, International Law, Human Rights, Civil Liberties, Public Services, Infrastructure, Public Health, Public Safety, Social Security, Social Welfare, Public Education, Public Transportation, Taxation, Fiscal Policy, Regulatory Framework"""
state_vocab = [word.strip() for word in state_vocab.split(',') if len(word)>0]
state_vocab
['Sovereignty', 'Constitution', 'National Security', 'Foreign Relations', 'Diplomacy', 'International Law', 'Human Rights', 'Civil Liberties', 'Public Services', 'Infrastructure', 'Public Health', 'Public Safety', 'Social Security', 'Social Welfare', 'Public Education', 'Public Transportation', 'Taxation', 'Fiscal Policy', 'Regulatory Framework']
market_vocab = """Profits, Competition, Consumers, Supply Chain, Distribution, Pricing, Mergers & Acquisitions, Outsourcing, Globalization, Innovation, Technology, Intellectual Property, Risk Management, Branding, Advertising, Market Research, Market Share, Market Segmentation, Market Trends, Market Analysis"""
market_vocab = [word.strip() for word in market_vocab.split(',') if len(word)>0]
market_vocab
['Profits', 'Competition', 'Consumers', 'Supply Chain', 'Distribution', 'Pricing', 'Mergers & Acquisitions', 'Outsourcing', 'Globalization', 'Innovation', 'Technology', 'Intellectual Property', 'Risk Management', 'Branding', 'Advertising', 'Market Research', 'Market Share', 'Market Segmentation', 'Market Trends', 'Market Analysis']
community_vocab = """Volunteers, Charities, Social Groups, Local Organizations, Non-Governmental Organizations, Faith-Based Organizations, Community Centers, Neighborhoods, Clubs, Activists, Advocates, Social Movements, Social Enterprises, Social Networks, Social Media, Fundraisers, Donors, Philanthropy, Collaboration, Empowerment, Inclusion, Social Justice"""
community_vocab = [word.strip() for word in community_vocab.split(',') if len(word)>0]
community_vocab
['Volunteers', 'Charities', 'Social Groups', 'Local Organizations', 'Non-Governmental Organizations', 'Faith-Based Organizations', 'Community Centers', 'Neighborhoods', 'Clubs', 'Activists', 'Advocates', 'Social Movements', 'Social Enterprises', 'Social Networks', 'Social Media', 'Fundraisers', 'Donors', 'Philanthropy', 'Collaboration', 'Empowerment', 'Inclusion', 'Social Justice']